CLAIMS

i 1 i: 1. A met neratmg i i 

the hapiotypes for a; leas: one locus from t individuals in h i I a i h method comprising: (a) for each individual 

- x > ! t - generating polymorphism and haplo v p ii 
and haplotypes for the locus; and (b) storing the polymorphism and haplotype data elements for the individuals in a

o i jt r i l j >. i. o the spatial relationships between the 

polymorphism nd lotvf ' rice for the locus;. 

2. The method of claim 1, whit t he v. . . 

haplotyj lapic airs for the gen< / i 

3. The method of claim 7, wherein the deriving step comprises ascertaining the frequency of the haplotypes and haplotype
pairs according to the Hardy-Weinberg equilibrium.

4 mx ' x i i h r , 'i ' I a f it, . > . x t 1 1 u I A\ c 

sequence of the gene or the gene feature from a first chromosome and a second chromosome in each individual in the
population to generate a plurality of nucleotide sequences for the population; (c) aligning the plurality of nucleotide
sequences for the populatio > n. i <x. om the aligned s« ng two hapiotypes for 

each individual as a haplotype pair for storage in a table in the database.

5. The method of claim 4, wK ^ . v 

6. The method of clain i g comp! ses corret. t L Jo tribution of hapiotypes or hapiotype 
pairs for effects imposed by a limited number of individuals in the population.

7. The method of claim 5, wherein the validating also comprises analyzing compliance of <. observed >. i j i with
Mendelian inheritance principles.

8. The tr t n t f t 5 e i 5 j jui.it Jti, a clinical 
population., a disease populatio m:: f c pi n r >: popuia m 

\ n tint t t j ol/pc lot tx 

t 1 iii in " da abase 

containing referenc - .mt i rr u :^ I a a olotype pairs 

ic obtype pairs 

lor the individual. 



10. The method of claim 9, wherem the identifying step sl 1 > ^ < > . . . . > s . ; . -j . " I 

11. The method of claim 10, wherein the determining includes calculating phylogenetic and/or linkage information for the
reference haplotype pairs.

! i t i r >_r i i t j. j hi ^ j hi 

frequency in the; database 

13. A method for identifying a correlation between a haplotype pair and a clinical response to a treatment, or other

>typ i i c I ) c x i f t i >c nt n i i >n t t -i h 

phenotypes e!.hiG]!c^ by - J n ca popi. at - b se - 9 a cmid on f \ < 3 \ ed wi h the 

clinical response or other phenotype, the locus comprising at least two polymorphic sites; (c) providing haplotype data for
each member of the c I c otype data compns , i phic sites 



present in the candidate locus; (d) storing the haplotype data; and (e) calculating the degree of correlation between
haplotype pairs and the clinical re to a f nen r other phex>type, I h, zing the haplotype and 

clinical response data. 

14. The method of claim 13 wherein step (e) is performed last.

15. The method of claim 13 wherein step (a) is performed before any one of steps (b), (c) or (d). 
16 ? method el claim 1 3 wh ted ', and (d 

17. The method of any one of claims 13-16, wherein the treatment comprises administration of a drug or drug candidate. 

18. The method of claim 17, wherein the candidate locus is a gene or a gene feature.

19. The method of claim 13, further comprising displaying or outputting the correlation.

20. The method of claim 19, further comprising calculating the statistical significance of the correlation. 

21. The method of claim 20, wherein the providing haplotype data step comprises (a) providing a genotype for the
i J> - ' i- J! tsng a possible haplotype pairs which are consistent with the genotype; (c) determining a 
probability for each possible haplotype pair that the individual has that possible haplotype pair, by accessing a database
containing I * n " 1 or haplotype pairs in a reference population; and (d) analyzing the determined probabilities to 
infer the individual's haplotype pair. 
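
The inference steps recited in claim 21 can be illustrated with a minimal sketch. It assumes that haplotypes are strings with one character per polymorphic site, that the reference database reduces to a hypothetical dictionary of haplotype frequencies, and that pair probabilities follow the Hardy-Weinberg expectation; the names `genotype_of` and `infer_haplotype_pair` are illustrative and do not appear in the claims.

```python
from itertools import combinations_with_replacement


def genotype_of(hap_a, hap_b):
    """Unphased genotype: one sorted allele pair per polymorphic site."""
    return tuple(tuple(sorted(pair)) for pair in zip(hap_a, hap_b))


def infer_haplotype_pair(genotype, hap_freqs):
    """Return (most probable pair, probabilities) for an unphased genotype.

    hap_freqs is an assumed stand-in for the reference database: it maps
    each reference haplotype string to its population frequency.
    """
    weights = {}
    # Enumerate the haplotype pairs that are consistent with the genotype.
    for h1, h2 in combinations_with_replacement(sorted(hap_freqs), 2):
        if genotype_of(h1, h2) != genotype:
            continue
        # Hardy-Weinberg expectation: p*p for identical haplotypes, 2*p*q otherwise.
        factor = 1 if h1 == h2 else 2
        weights[(h1, h2)] = factor * hap_freqs[h1] * hap_freqs[h2]
    total = sum(weights.values())
    if total == 0:
        return None, {}
    # Normalize so the probabilities of the consistent pairs sum to one.
    probs = {pair: w / total for pair, w in weights.items()}
    best = max(probs, key=probs.get)
    return best, probs


# Hypothetical two-site reference frequencies, for illustration only.
reference = {"AC": 0.5, "AT": 0.2, "GC": 0.2, "GT": 0.1}
genotype = genotype_of("AC", "GT")  # heterozygous at both sites
best_pair, probabilities = infer_haplotype_pair(genotype, reference)
```

With these made-up frequencies, two pairs explain the doubly heterozygous genotype, and the sketch selects the pair with the larger normalized probability.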

22. A method for identifying a correlation between a haplotype pair and susceptibility to a condition or disease of interest, 
or other phenotype of interest, comprising the steps of : (a) selecting a candidate locus hypothesized to be associated with 
the phenotyf - ~ . n " j iirerest, the locus comprising at least two polyn e s <. isi providing 

t \. i he cand dale lo u n t j. i t i «. if n o' disease of 

<■ organizing the disease hapiotype data in a database w if, ng the 

disease haplotype data to calculate haplotype pair frequencies; (e) accessing a database containing haplotype data for the
candidate locus for each member of „ ^ \ A ' \ n 

the reference haplotyne r -Aa t \\ nJ <- \ i h high 

t i y in the population a, t t ^ -> n v - -> - n hi n r healthy reference population, 

Identifying a correlai Jf. bility to the disease or condition of interest. 

23. The method of claim 22 wherein step (f) is performed after step (d). 

24. The method of claim 22 wherein step (e) is performed before any one of steps (b), (c), or (d).
25 ' he met xsd of claim 22 wherein step (< s perto c or any one t a i c), or (d 

26. The method of any one of claims 22-25, wherein the candidate locus is a gene or a gene feature.

27. The method of cla 1 " te t. ouiputting the identified correlation. 

28. The method of claim 27, further comprising calculating the statistical significance of the identified correlation.

29. The method of claim 28, wherein the providing haplotype data step compriss - p jvidi j 3 a n< ," " - h 
individual, (bi enume at - -* . ■> ^ c~ are consistent with the genotype; (c) for each possible 

Ji ^ _uai has that haplotype pan j a cess x 
freque ay data 5 1 n - eierence population n^r d nfe ng "< n<*i > ' c 3$ed on the 

determined probabilities. 



30. A method of predicting an individual's response to a medical or pharmaceutical treatment, comprising: (a) selecting at
least one candidate gene for which a correlation between haplotype content and response to the treatment has been

i it 11 ] h i\ > h i 1 1 i > 

individual's response will be the response associated haplotype pair with information on the correlation.

31. The method of claim 30, wherein the selecting step comprises outputting a list of candidate genes associated with
diiferen pc s - Ik osfn nt. 

3; f nefhoc < lairn 1, iurth< xmsin 



34. A computer implemented method for generating a gene structure screen for display on a display device, comprising
the steps of: (a) retrieving from a database and displaying in a first area data indicative of the frequencies of occurrence

3 from a database 

and displaying in a second area data Indicative of the frequencies of oeeu' e <\,-> 
groupings; (c) retrieving from a database data indicative of ge * • \ u - d -;'i j j 1 . .hud area a graphical 
reprs nta c , r x c c c ^pd ; i' ih sfi.to 

cause the appropriate nucleotide frequencies to be displayed in the second area.

35. A computer implemented method for generating a haplotype pair frequency screen for display on a display device,
comprising the steps of: (a) displaying in a first area a plurality of selectable items each corresponding to a polymorphic
site for a predetermined gene; (b) selecting one or more of said selectable items; (c) displaying in a second area the
haploiype pair x m nj i i n ? oo jlaiton or the x x d c A, v i te x »'ing in a hird area data 
indicative of haplotype frequencies for a plurality of member groupings within the population.

36. A computer implemented method for genera it step 
of: (a) displaying in a first area a graphical scale showing a reference for determining progressive degrees of linkage
between polyn o - - as i a popula on b) displaying in a second area a graphic a ct having a plurality of 
l 1 i II Jisplays 
an indication of degree of linkage between polymorphic sites corresponding to that grid, in accordance with the reference
shown in the first area.

r ml U Jat i " v. 

33 ' t im ml mil li I f ' 

he £ p < ( p! s n - i l i < I f > ems each corresponding to a polymorphic site for a 

predetermi ed qe ^ i a " ! i f 

structure having nodes for each haplotype in a population, where the distance between nodes is indicative of the number
of nucleotides that would h o be flipped tc i ;!c;yt a not her. 

e nodes are connected by links that indicate a single nucleotide difference between

nodes. 
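
The node distance and link rule described in these tree claims can be sketched as follows. This is a minimal illustration, assuming haplotypes are equal-length strings with one character per polymorphic site; `hamming` and `single_step_links` are hypothetical helper names, and the example haplotypes are made up.

```python
from itertools import combinations


def hamming(h1, h2):
    """Number of sites at which two equal-length haplotypes differ."""
    if len(h1) != len(h2):
        raise ValueError("haplotypes must cover the same polymorphic sites")
    return sum(a != b for a, b in zip(h1, h2))


def single_step_links(haplotypes):
    """Pairs of haplotypes that differ by exactly one nucleotide."""
    return [
        (h1, h2)
        for h1, h2 in combinations(haplotypes, 2)
        if hamming(h1, h2) == 1
    ]


# Hypothetical three-site haplotypes, for illustration only.
haplotypes = ["ACG", "ACT", "GCT", "GTA"]
links = single_step_links(haplotypes)
```

A display routine could draw one node per haplotype, connect the linked pairs, and space the remaining nodes according to `hamming`.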

i 3 st i the noc iisolay an Indieaf ic fre jrt 

the haplotype represented by the node, 

41 . A computer irnj 

i lift il " > a^c a 

plurality of second selectable Items each corresponding to a po nc 

reference for dete mmln - progressive degrees of haplo yce rx d .obvrg a 

y xjIiic A malr > £ ruUiro having i p \ a Jit> c ot d ' - , < > r - " "ti > < ' x ! > l I > die irsl 

r - Ire--; r , 1 1 cat on of degree of identification reliabi y for identifying 

the haplotype corresponding to that grid using genotyping specified by the second selectable items, in accordance with the



42. The method of claim 41, wherein the indication of degree is color.



43. A method of displaying clinical response values of a subject population as a function of haplotype pairs of the 
individuals in the population, comprising: (a) receiving from a computer-readable storage device data representing

s ^ 3 the subject pops. t . w matrix 

each of whose cells contains a graphical representation of the clinical response values of individuals having the haplotype
pair corresponding to that cell of the haplotype pair matrix.

44. A method of displaying clinical response values of a subject population as a function of haplotype pairs of the
indvioual in iho popul r t r r r i < 
for a predetermined gene, which when selected, will generate haplotype pairs; (b) displaying a second selectable item
representing a clinical response measurement; i when selected In conjunction f the first selectable Hems Will 
uiu < f I w J a il >t (. ii r i "i ;ec a - - - t. ; I tc il i i n 
values or the selected clinical i v h lapiotype pair corresponding to that cell of the 
haplotype pair matrix. 

45. The method of claim 43 or 44, wherein the graphical representation of clinical response values is a color scale or gray
scale, the shade of each cell being proportional to the mean clinical response value o v v ; \ ,ybv „ 

piiOIH }> UK I j Ik II ii I N 

46. The method of claim 45, further comprising displaying a means for adjusting the range of mean clinical response
values represented by the color scale or gray scale, wherein adjustment of the range causes the displayed shade of color
or gray of the cells of the haplotype pair matrix to be adjusted accordingly.
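
The shading behavior of claims 45 and 46 can be sketched in a few lines. The sketch assumes each record is a (haplotype pair, clinical response value) tuple, that shade is an 8-bit gray level proportional to the mean response of the cell, and that the adjustable range is given by the hypothetical parameters `low` and `high`; none of these names come from the claims.

```python
from collections import defaultdict
from statistics import mean


def mean_response_by_pair(records):
    """Mean clinical response value for each unordered haplotype pair."""
    groups = defaultdict(list)
    for pair, value in records:
        # Sort so that (H1, H2) and (H2, H1) land in the same matrix cell.
        groups[tuple(sorted(pair))].append(value)
    return {pair: mean(values) for pair, values in groups.items()}


def gray_level(value, low, high):
    """Map a mean response to a 0-255 gray level over the range [low, high]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    fraction = (value - low) / (high - low)
    fraction = min(1.0, max(0.0, fraction))  # clamp values outside the range
    return round(255 * fraction)


# Hypothetical records, for illustration only.
records = [(("H1", "H2"), 5.0), (("H2", "H1"), 15.0), (("H1", "H1"), 40.0)]
means = mean_response_by_pair(records)
shades = {pair: gray_level(m, low=0.0, high=50.0) for pair, m in means.items()}
```

Calling `gray_level` again with a different `low` and `high` corresponds to adjusting the range, which changes the shade of every cell accordingly.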

1 f iPt\' 1 1 -i t ^ v v - ibulion of 

individuals across the range of clinical response values. 

48. The method of any one of claims 43, 44, or 45 wherein at least one cell includes a selectable area which, when
selected, will cause the display of a histogram indicating the distribution of individuals across the range of clinical response 
values. 

49. The method of any one of claims 43,44 or 45 which further comprises displaying a selectable item which, when 
kt i . . \ i <. da! polymorphic 

sites and the clinical response values. 

50. The method of claim 43,44 or 45 which furthe ;omp a\ m wine ^hen selected, d plays 
the numerical mean and standard dev alien o oliri 1 J ! t r » •> r in 
the matrix. 

51 . The method of claim 43,44 or 45 which further comprises displaying a selo v i i r ,»hc rt s-le-ted, causes 
he display of f|,-- - ■> , -i t - - - - li i -I 
response values between individuals having different haplotype pairs is statistically significant.

52. A computer-implemented method for carrying out a genetic algorithm for finding an optimal set of weights to fit a 
function of pol i c _ Ji J t c t f-c ^ i i 

ling t i 1(1; 3 a v unable oc nt c Ihr for >ct1r } the ' .inter 

of agents paramotei t~ -de n - cr^ t j r , playing a variable 

controller for setting the crossover rate parameter; (e) displaying one or more selectable items each corresponding to a
polymorph i i displaying a sek nt h jenetk algorithm 

\ v ( < i n i i t i i etc i <\s\, ' , - , i ^ \ < \ , < I 

v - a ion, rcsuks in the evecislin^ of the genetic algorithm calculation with the 

parameters set by the variable controllers, and the display of the residual error of the model as a function of the number of 
ie <cj t,lts c the the optimal weights 

for each of the polymorphic sites. 



53. A ui ~ j! r method tor displaying correiations between ciinicai outcome values for a selected population, 
comprising 2) (a) o cplav i i i t of selectable items corresponding 'o 1 . - )les 3i (o 1 
displaying a it > ^ i 3 < pitying a 

> nt r » n-i i j ti •> selected population c i cling first item from 

fx st f It ^ \- u i - 1 1 1 Jo \ w ht ^ ■> ^ x ^ 

value of the corresponding clinical outcome valut o t h c J heiem selec ion 

of a second item from the second plurality of selectable items causes each data point to be plotted on the y axis of the
scatter piot ic-o c i r r ending ciinicai outcome value for the Individual associated with the data 

point. 

54. A ti t >d for conducting finical trial of a treatment f ocoi to nedical condition of inters comprising: (a) 

! j ther loci) <nown or expected to be involved in a particular disease or drug response: (b) 
) a reference popi latic ^ e x a broad and :epresenta'ive genetic .. ^ ^ . ^ cine 

DST i n )l i ;( c ) nl th ( la ko qc x > (or 

othe loc, oi eac" men ~* c im r requeues population distributions and 

> for each of th« . x . trial population of 

individuals who have the medical condition of interest. ( i treating individuals In me trial population according to h 
treatment or j , < v \ - , , - , of ht k f i 

genes (or other loci) for each member of the trial population; (i) determining the correlations between individual responses
to tfx I Jnuil i oi \ uu 3 j° t. o n r 

correiations., constructing a model that predicts the response of an x ' 1 divdua! 

hapiotype content, 

55. The method of claim 54, further comprising the step of deriving from the haplotype distribution found for the reference
population a reduced set of genotyping markers, which alio -. < \ iuai t haplotypes to be accurately predicted without
conducting a complete r ne analysis, and using the reduced set of genotype markers to determine
haplotypes in step (h).

56. A method of inferring genotypes of individual subjects for a selected gene having at least m polymorphic sites,
comprising (a) providing a database of m-site haplotypes of the selected gene from a representative cohort of individuals;

(b) E3billc3f 111 the f c ] I r i 

result from all possible pairs of observed haplotypes; (d) calculating the expected frequency of these genotypes assuming
the Hardy-Weinberg equl bt uir e <. h i xe length m as the 

' sol fp i 1 <_ t r n c -\ < f i < i r i 

nucleotides at the of h x m m t s I i nidi - i - > i x > j t 
polymorph! i- . ^ -x among those masks having an acceptable level of 

it i i i i c he by measuring only the n 

X K t >l [. (. ill 311 ! ll 3 

haplolype. tfie full m-site hapiotype of a member o he In . 1 spbtype. 

57. The method of claim 55, wherein the calculation of ambiguity for a mask comprises (a) identifying all pairs of
genotypes that are rendered identical by \ lir I or t mask; (b) calculating the geometric mean of the calculated
Hii ~ a i mn ng all such geometric means for all
ambiguous pairs to obtain an ambiguity score for the mask.
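
One possible reading of the ambiguity calculation in claim 57 is sketched below. Because parts of the claim text are illegible, the sketch rests on stated assumptions: a mask is the list of site indices that are actually measured, the "calculated" quantities are the Hardy-Weinberg expected genotype frequencies from claim 56, and the score is the sum of the geometric means over all genotype pairs that the mask cannot tell apart. All function names and frequencies are hypothetical.

```python
from itertools import combinations, combinations_with_replacement
from math import sqrt


def genotype_of(h1, h2):
    """Unphased genotype: one sorted allele pair per polymorphic site."""
    return tuple(tuple(sorted(pair)) for pair in zip(h1, h2))


def expected_genotype_freqs(hap_freqs):
    """Hardy-Weinberg frequency of every genotype formed from observed haplotypes."""
    freqs = {}
    for h1, h2 in combinations_with_replacement(sorted(hap_freqs), 2):
        weight = (1 if h1 == h2 else 2) * hap_freqs[h1] * hap_freqs[h2]
        genotype = genotype_of(h1, h2)
        freqs[genotype] = freqs.get(genotype, 0.0) + weight
    return freqs


def ambiguity_score(hap_freqs, mask):
    """Sum of geometric means over genotype pairs that look identical under the mask.

    mask is assumed to be an iterable of the site indices that are measured.
    """
    freqs = expected_genotype_freqs(hap_freqs)
    score = 0.0
    for g1, g2 in combinations(freqs, 2):
        if all(g1[i] == g2[i] for i in mask):
            score += sqrt(freqs[g1] * freqs[g2])
    return score


# Hypothetical two-site haplotype frequencies, for illustration only.
observed = {"AC": 0.5, "AT": 0.3, "GC": 0.2}
full_mask_score = ambiguity_score(observed, [0, 1])  # every site measured
one_site_score = ambiguity_score(observed, [0])      # only the first site measured
```

Under this reading, measuring every site leaves no ambiguous genotype pairs, while dropping a site raises the score, so masks can be ranked by how much information they give up.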

v> i n t b a application of the select c € c; i£ es n tmbigulty in that two 

haplotype pairs A and B exist that could explain a given genotype, and the Hardy-Weinberg equilibrium predicts
probabilities PA and PB, where PA+PB=1, the assignment of a haplotype pair is carried out by a process comprising (a)
selecting a random number between 0 and 1; (b) if the random number is less than or equal to PA, assigning the

>. e I I 1 i h Jl f y 3! E 
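
The random assignment described in this claim can be sketched as follows. The end of the claim is illegible, so the sketch assumes that haplotype pair A is assigned when the random number is less than or equal to PA and that pair B is assigned otherwise; `assign_pair` and its parameters are hypothetical names.

```python
import random


def assign_pair(p_a, pair_a, pair_b, rng=random):
    """Assign pair A with probability p_a, otherwise pair B (assumed rule)."""
    if not 0.0 <= p_a <= 1.0:
        raise ValueError("p_a must lie between 0 and 1")
    draw = rng.random()  # uniform random number in [0, 1)
    return pair_a if draw <= p_a else pair_b


# Hypothetical haplotype pairs and probability, for illustration only.
chosen = assign_pair(0.7, ("AC", "GT"), ("AT", "GC"))
```

Over many individuals with the same ambiguous genotype, this rule assigns the two candidate pairs in roughly the proportions PA and PB, rather than always choosing the more probable pair.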

59. A method of determining polymorphic sites or sub-haplotypes that correlate with a clinical response or outcome of

'iinxJio\ snd clinical response or outcome data (clinical outcome values; 
from a cohort of subjects; (b) statistically analyzing each individual SNP in the haplotype for the degree to which it

x v ' f a rwroc nc al measure of the degree of correlation; (c) saving for 

individual SNPs whose numerical measure of the degree of correlation with the clinical outcome
f t c t cul >f value d c aling all possible f lbinatfon f- ? i 



a set of n-site sub-haplotypes where n = 2; (e) statistically analyzing each newly generated n-site sub-haplotype for the
degree tO'-.ic I ! i i i >. t . c degree of 

correlation; (f) saving for further processing those n-site sub-haplotypes whose numerical measure of the degree of
>,<->, e ' ir ' ' in;; clinical , -c - - values exceeds t" - ,1 ol value; i generating - possible p;;i;-«se 
combinations among and between the saved SNPs and saved sub-haplotypes, to produce new sub-haplotypes with
increased values of n; (h) repeating steps (e) through (g) until either (i) no new sub-haplotypes can be generated, or (ii)
no further sub-hapiotypes having r v. > \u - i 

60. The method of claim 59, further comprising the step of displaying those saved SNPs and sub-haplotypes whose

if i t fie degree ot ! i h ir eel outcome value exceeds a second cut-off value, wherein 

the second c jt-oif u grs i hi t- ;ut- of; vales 

61. The method of claim 59, wherein the numerical measure of degree of correlation is replaced by the p-value for the
correlation, and SNPs and sub-haplotypes are saved if the p-value is less than a first cut-off value.

S2 '"o mel'^i r rhi i > I r -> i of > ( i <od SNPs and sub- haplotypes whose p- 

value for the correlation with the clinical outcome value Is less than a second cut-off value, wherein the second cut-off 
value is less than the first selected value. 

63. The method of any one of claims 59-62, further comprising the step of excluding from further processing complex
sub-haplotypes which are constructed from smaller sub-haplotypes, where the smaller sub-haplotypes each have
correlation values that are at least as significant as that of the complex sub-haplotype.

64. A method of deternr c r ; , i on i k;u 'jthtype a > >> < fi . I m . ^ p< - e or outcome of 
interest, comprising: (a) providing c - , - -> i -> r n < t rr rl nica! response or 
outcome data, from a cohort of subjects; (b) statistically analyzing each single gene haplotype for the degree to which it
correlates with the clinical response or outcome of interest, and calculating a numerical measure of the degree of

i u 3 in , k i j. s i ho Kid i i ("ft It! 

he d 3l i c x n c ^ -> , < ^ n 

polymorphic sites, generating all possible sub-haplotypes having a single site masked, so as to provide a set of sub-

ueiated sub-haplotype for the degree
is i > ! fi 3 3 num i i i 

r correlation; i saving for further processing i:fio:;e subhaploiypes whose numerical measure of u degree of i 1 1 it m 
with the clinical response or outcome of interest exceeds the first selected value; (g) from the saved sub-haplotypes,

atmg all possible sub h i q hroug -> (gt unt 1 erher 

■ I) no new sub- haploiyc n < h f "I < < urthersub- 

;iolyp< ing moi ■ ked si I i m e! < co limit can be generated. 

65. The method of claim 64, further comprising the step of displaying those saved sub-haplotypes whose numerical
measure of the degree of correlation with the clinical response or outcome of interest exceeds a second selected value,
wherein the second selected value is greater than the first selected value.

66. The method of claim 64, wherein the numerical measure of degree of correlation is replaced by the p-value for the
I'nisPir ^ . ^ selected value. 

67. The method of claim 66, further comprising the step of displaying those saved sub-haplotypes whose p-value for the
correlation wrf >. ^ _ >. ^ (. c i >. _, t a 3 second selected value, wherein the second 
;ele< too vs lue is less th hs t ss c d vah 

c ) in t "4 itt nth, Ii t t f c s x tompiex 

sub-haplotypes which are constructed from smaller sub-haplotypes, where each of the smaller sub-haplotypes has
correlation values that are at least as significant as that of the complex sub-haplotype.

69. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to adjust
observed haplotype pair frequencies within a population group, said haplotype pair frequencies being stored in a



computer-readable da i ) e otype information for a gene or gene feature of interest, the computer-readable 

program code comprising: (a) computer-readable program code for causing a computer to access said database and 
id i iii computer-readable program code lor 

i >i t r \ i t xp ct f In n airs according to the 

Hardy-Weinberg equilibrium, based upon the observed distribution of haplotypes or haplotype pairs in the population; and
;> co-npu v t c . J ' - i t . ' i i to select the most probable hapiotype pair for the individual 

based on the observed 



70. The computer-usable medium of claim 69, further comprising computer-readable program code stored thereon for
causing a computer to correct the stored distribution of haplotypes or haplotype pairs for effects imposed by the presence
of a limited number of individuals in the population.



71. The computer-usable medium of claim 69, further comprising computer-readable program code stored thereon for
causing a computer to validate haplotype pair assignments by analyzing for compliance of the assigned haplotype pair
with Mendelian inheritance principles.



72. The computer-usable medium of claim 69, wherein the population is selected from the group consisting of a reference
population, a clinical population, a disease population, an ethnic population, a family population and a same-sex
population.



73. A computer-usable medium having computer-readable program code stored thereon, for causing haplotype pair
3 s Jim i i i N . ^ v \ . . c < c ' l 

of interest is stored in a computer-readable form, the computer-readable program code comprising: (a) computer-readable
program code for causing a computer to generate all possible haplotype pairs consistent with the stored genotype; (b)
computer-readable program code for causing a computer to access a database containing reference haplotype pair
frequency data and to determine from the frequency data the probability, for each of the possible haplotype pairs, that the
individual has the possible haplotype pair; and (c) computer-readable program code for causing a computer to select the
most probable haplotype pair for the individual.



readable program code comprising: (a) computer-readable program code for causing a computer to access a database
containing da posses to treatments, or other phenotypes, exhibited by individuals in a clinical population; 

(b) computer-readable pioy a ^ ^ , 1 , i i 3 it h 

xi vicuc i p ' . ^ i i c u on v •■ li p c >i< > present at 

the candidate locus: and (c) computer-readable program code for causing a computer to calculate the degree of 
c a t> ^ , ^ ^ r- - he pho-o I analysis of 

he apl° ^ 1 ^ 



75. The computer-usable medium of claim 74, wherein the treatment comprises administration of a drug or drug
candidate. 



76. The computer-usable medium of claim 74, wherein the candidate locus is a gene or a gene feature. 



77. The computer-usable medium of claim 74, further comprising computer-readable program code stored thereon for
causing a computer to store, display, or output the degree of correlation.

78. The computer-usable medium of claim 74, further comprising computer-readable program code stored thereon for
causing a computer to calculate the statistical significance of the correlation.



79. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to 
;c t > r ' N l c n ;n r 3 ,ic. ; >u= x xibi , c c c ic y iisoax o'rieresf 3 h in '< „n 3 
i ' n present at a candidate locus hypothesized to be associated with susceptibility to the condition 

or disease of interest, or with a phenotype of interest, the computer-readable program code comprising: (a)
computer-readable program code for causing a computer to access haplotype data for the candidate locus for each member of a
population having the phenotype or cond c 1 e disease hapiotype data (b) computer-readable 

program code for causing a computer to statistically analyze the disease haplotype data to calculate haplotype or
haplotype pair frequencies; (c) computer-readable program code for causing a computer to access a database containing
haplotype data for the candidate locus for each member of a healthy reference population ("reference haplotype data"); (d)
computer-readable program code for causing a computer to statistically analyze the reference haplotype data to calculate
hapiotype or hapiotype- pair koc, .one . c e >ct'\'< c _c oi< i \ a i. \ ^ b > ; j: c >: >. < modify; 
correlation of a hapiotype- oi fi j , - i > n »^ilh ht , 

of interest, when the haplotype or haplotype pair has a higher frequency in the population having the phenotype, condition
or disease of interest than in the reference population. 

80. The computer-usable medium of claim 79, wherein the candidate locus is a gene or a gene feature. 

81 . The computer-usable medium of claim 79, further comprising computer- readable program code stored thereon for 
causing a or t t 

82. The computer-usable medium of claim 79, further comprising computer- readable program code stored thereon for 
causing a computer to calculate the statistical significance of the correlation. 

83. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to predict 

-i . >l <. . .i.i. " oased on one or more selected hapiotypes or hapiotype 

pairs of the individual, the computer-readable program code comprising: (a) computer-readable program code for causing
a computer to access a database of correlations between haplotypes or haplotype pairs and responses to the medical or
pharmaceutical treatment in a reference population; (b) computer-readable program code for causing a computer to locate
haplotypes or haplotype pairs in the database that match the selected haplotype pairs of the individual; and (c) computer-

i j i k f c h c i > i f <. it 

associated in the database with the selected haplotype or haplotype pair.

84. The computer- usable medium of claim 83, further comprising computer- readable program code stored thereon for 
causing a computer to generate an error estimate for the prediction, 

85. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to

computer-readable pux . , ^ ^ v c t c = aL i ^ n 1 dif\i of the 

display device u i a i c ^ - . , - - . i h n > d r x j vi - or 

groupings of a reference population; (b) computer-readable program code for causing a computer to retrieve from a
database data indicative of the gene's structure and gene features; (c) computer-readable program code for causing a
i. nn jut ! to < f ^ I ' i i f t i tuie i -e 

selectable items indicating the location of gene features, and graphical indicators of the location of polymorphic sites on
the gene; (d) computer-readable program code for causing a computer to display in a third area of the display device, in
response to a user's selection of an item indicating a gene feature, a graphical representation of the structure of the gene

i >k , ■• 3 < -> ^ e Ml sle orogram 
iv i \ - . - -< e and display in a third area of the display device, in response to a 

user's selection of an item indicating the position of a polymorphic site, data indicative of the frequencies within the 
member groupings of the occurrence of particula . - >lvmorphic site. 

86. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to 
display on a display device haplc p jc x . iH j I i eJ gene or gene 

feature, the computer-readable p;ogram code comprising: ( - computer-readable program code for causing a computer < 
display on he d splay dev ce a ol ji nil \ t r < > to 3 polymorphic site in the gene or 

s c computer-readable program code for causing a compute o retrieve from a da 1 lay on the 

display device, in response to a user's selection of one or more items indicating polymorphic sites, individual haplotype
pairs in m . 3 ' tr > , < > i, > < n i > 1 ii r < k p a ' u 

r - - < > device data ndu 

within one or more member groupings within the population. 



87. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to 

display on a display devsce polymorphic sit« is- > ;gc- data for a gene or gene structure o "it I tp tei 'eadable 

i i i s \) j 3 ci >rlav Q£ vice 

one or r - i j i > t , < r ' , ,<■ , ' s - i i , - p - xjL — ^ c ' t*c 'im o 

gene feature nf ind wh n Tiatrb c ids to a dlfk population o ■; u 3; and 
(b) computer-readable program code for causing a computer to display on the display device, in each cell of a matrix

structure a graphical indication o; degree 01 linkage between the f >l v 1 c c 3 <. s 1 > hs coordinates 
<■ 1 , 1 ■ na / 



83. T'f mpuier- usable medium of claim 87, wherein color i t plucai in v. c c ; 1 g< 

i.'ht ■■ f bo medium fu ther ' - v , ;m coco sto cc thereon for causing a computer to display 

a reference color scale relating color to degree of linkage. 



89. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to
display on a display device a phylogenetic tree, the computer-readable program code comprising: (a) computer-readable
program code fcr 1 1 i t i 

the gene or gene feature of interest; and (b) computer-readable program code for causing a computer to display a

phylogenetic tree structure having a node for each haplotype in a population, where the distance between nodes is 
proportional to the minimum number of nucleotides that would have to be changed to interconvert the corresponding



90. The computer-usable medium of claim 89, further comprising computer- readable program code stored thereon for 
causing a computer to display connections between the nodes that indicate a single nucleotide difference between the
haplotypes represented by the nodes.



91. The computer-usable medium of claim 89, further comprising computer-readable program code stored thereon for
causing a computer to display at each node an indication of the relative frequency of occurrence of the haplotype
represented by the node.



2 A c ^ incu me n -e-npuier to 

1 4I '\ „ . ^ . 1 ciir k 

'^c dcable orogrum 1. ode i 11 1 n 1 t 3 

1 t >r\ < fo 1 polymorphic site; (b) computer-readable program code for causing a computer to display on the display
device a matrix structure, wherein the axes of the
matrix sti hi. ^ 1 >\ elected 

from ihe iirst plurality > < b c t Items; and (e) computer-readable program code lor causing a computer 1 display on 
tht di r I / Ht v c n t i h r ; r 1 1 ! ir jf bility of t^e assignment to an 

individual of the haplotype pair corresponding to the coordinates of the cell in the matrix, when the individual is genotyped
only at the polymorphic sites selected from the second plurality of selectable items. 



33. The compute' u^abit in- <_l 1 l til luplotype 

pair assignment, and wherein the medium further comprises computer-readable program code stored thereon for causing
a computer to display a reference color scale relating color to reliability of haplotype pair assignment. 



94. A cornouler us a oh nediom 1 3 >mputer 1o 

cispbv clinical 1 3 t m > I r the 

individuals in the population, the computer-readable program code comprising: (a) computer-readable program code for
causing a computer to retrieve from a computer-readable storage device, data representing haplotype pairs and clinical 
response values, or other phenotype data, for 1 subject population: and (b) (.1113 program code for 

u-ing 3 \ r - •< , , 1 r -ach of whose cells contains a graphical 

representation of the clinical response values or other phenotype data of individuals having the haplotype pair 
corresponding to the coordinates of that cell in the haplotype pair matrix. 



95. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to 
i c f I iv> c n i ii c I t t f c c it ct t ! if n of 1 

5airsc r ndh c lais in the population for a gene or gene feature of interest, the computer-readable program 



-readable program code for causing a computer to 

" ; f i - I ~l I" - — - i * — -i l « 1/- I II- Io II' S.I 1/ JltS 

' " e polymorphic sites, corresponding to the first 

. .. o ...i presentation of the mean clinical 
•electee • ^ \ hero of 




97. The computer-usable medium of claim 96 >'» . . jfo iia 

code stored thereon for causing a computer to display a means for adjusting the range of mean clinical response values or 
other phenotype data represented by the reference color scale; and (b) computer-readable program code stored thereon 
for causing a computer, in response to the adjustment of the range of clinical response values or other phenotype data 
represented by the reference color scale, to adjust the color of the cells of the haplotype pair matrix. 

wherein the graphical representation of data is a histogram indicating 

, . ^electable a ea n r outer-readable program code stored thereon 

for n ing a ompul i ) ;p jais having the haploiype pa presented by thi f1h < II in the 

matrix, a histogram indicating the distribution of the individuals across the range of clinical response values. 

100. The computer-usable medium of any one of claims 94, 95, or 98, which further comprises computer-readable program 
code stored thereon for causing a computer to display a third selectable item, and computer-readable program code 
stored thereon for causing a computer to display, in response to selection of the third selectable item by the user, the 

t .tislical ' i y ii i n >j. , 
values. 

101. The computer-usable medium of any one of claims 94, 95, or 96, which further comprises computer-readable program 
code stored thereon for causing a computer to display a fourth selectable item, and computer-readable program code 
stored thereon for causing a computer to display, in response to selection of the fourth selectable item by the user, the 

mericat mean £ c s >tponse values a x s. ,pe pair in the 

matrix. 



102 1 it < nr j t 3 i i c i nputer -readable f 

code stored thereon for causing a computer to display a fifth selectable item, and computer-readable program code stored 
thereon for causing a computer to display, in response to selection of the fifth selectable item by the user, the results of an 
a . l >. i l hi. i k " n x a i° jnse values between 

individuals having different haplotype pairs is statistically significant. 

103. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to 
carry out a genetic algorithm for finding an optimal set of weights to fit a function of polymorphic site data for a 
gene or gene feature of interest to a clinical response measurement, the computer-readable program code 
comprising: (a) computer-readable program code for causing a computer to display 
c i i jjbk program 

I computer- 

r r - e to display a variable controller for setting the mutation rate parameter; (c) 
computer-readable program code for causing a computer to display a variable controller for setting the crossover rate 
r ' -i\ s ipute * iek using a computer to display one or mor selectable items each 

corresponding to a polyn Mi -> t- rne if ^orm r j > gram code for 

causing a come . i or _ o of fc ^ . t j ■■ and (g) computer- 

readable program code for causing a computer, in response to the selection by the user of one or more selectable items 



< o e po j polymorp! lection by u of the item for initiat of the gt 

j lie ih n Igoriihm cal t t « by the ccmroik c r: 

xi' h£ , , - \\ '- s > in k m > ' „' v ; <. ; c xh - a< ^ - - \ - . 

of the genetic algorithm calculation showing the optimal weights for each of the polymorphic sites. 

104. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to 
display on a display device correlations between 1 GOLcmec ome measur« 
e a oit m [ ^ ^ t ule U 
causing a computer to display ality ot tl 5 t a espondirsg to clinical outcome measurements; 7) (b) 

irn mt i i jI f i ' jbk ik ns com: spot 

clinical outc am 1 
plot of data points, each data point corresponding to an individual in the selected population; (d) computer-readable 
program code for causing the computer, in response to selection by the user of an item from among the first plurality of 
selectable items, to locate each data point along the x axis of the scatter plot according to the clinical outcome value for 
the associated individual from the clinical measurement represented by the selected item; and (e) computer-readable 
program code for causing the computer, in response to selection by the user of an item from among the second plurality of 

(intjfci i c c = >i )' ';:uk pk < i ii t c linn sic jtc ome value for 

ihe associated individual from the cl nu 

105. A comf I g a computer to 
provide information of use In conduct ng 3 , 1 , . < i c ma ,l notocol for a medical condition of Interest, the 
computer-readable program code comprising: (a) computer-readable program code for causing a computer to access a 
database of DNA sequence data for selected genes or other loo n ; epos i i and to access a 
database of (or accept t ss nput 

i -d .'tr , it ^ ro "i r x for causing a computer to assign to each member of the reference 

population haplotypes for each of the selected genes or other loci; (c) computer-readable program code for causing a 
computer to calculate the iiei. i . Ii nit in 

each of the assigned haplotypes in the reference population; (d) computer-readable program code for causing a computer 
to assign to each memt < \ ,< t < > h i i n - ' I, i < h <~ < ibt 

frequencies, population distributions and statistical measures calculated In the reference population; (e) computer- 
readable program code for caua . „ L„ . i. V V. „ <- v 1 fl 

treatment and individual haplotypes, for each of ^ r r Lk f , i >i,lt m iahle program code for 

causing a computer to accept as input an individual's DNA sequence data or haplotypes for one or more of the selected 
i. > i iLii'ei maoade c - . j. <. J j , n output the expected 

reap< - mii\ - - ^ a, ir-rn . t >c on the determined correla tie ; a t,v< >r i dividual responses to the 
treatment and individual haplotypes. 

106. The con i it i I man 3 

sduced set 

of genotyping markers, which allow an individual's haplotypes to be accurately predicted without conducting a complete 
molecular haplotype analysis; and computer-readable program code stored thereon for causing a computer to use the 

reduc ec c k . x , . 

107. A compute r-us lored thereon, for causing a computer lo inf 

rr polymorphic sites, the computer-readable program 
code comprising: (a) computer-readable program code for causing a computer to access a database of m-site haplotypes 
of the selected gene from a representative cohort of mc \ii_ „ i I i i s i <. causing a 

<.i)i io A i trie frequency of occurrence for each it the haplotypes; i >r t f 3i I program code for 
< sir in mi ^ ■> ; c ii mnt\p; ! r - if t - " -II i f k r i ' <- h \ k 

computer-readable program code for causing a computer to calculate the expected frequency of these genotypes 

asGjn imj -k ji cute' asa Jable c og am cooe for causing a computer to generate a 

i 1 ii i i n -\ r " " d Titity of the 

nucleotides at m-n polymorphic sites and admits the identity of nucleotides at the other n sites; (f) computer-readable 
program code for causing a computer to calculate, for each mask, how much ambiguity results from genotyping with 
only the n polymorphic sites whose identity is admitted by the mask; (g) computer-readable program code for causing a 

compel > Jif i. f.ii r i > , v . >r c « c more masks 



108. The computer-usable medium of claim 107, which further comprises computer-readable program code stored thereon 
for causing a computer to calculate the level of ambiguity for a mask, the computer-readable program code comprising: (a) 

> ) it i c 3l program „ n causing a computer 1 , n! i pair;; c c n > f. - that are rendered identical by 
>t i 1 ^i causing a computer to calculate the geometric mean of 

the calculated Hardy-Weinberg frequencies of each pair of genotypes rendered identical by application of the mask; (c) 
computer-readable program code for causing a computer to sum all such geometric means for all ambiguous pairs to 
obtain an ambiguity score for the mask. 
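The scoring rule of claim 108 can be sketched as follows. The example assumes that a "genotype" is an unordered pair of haplotypes, that haplotypes are equal-length strings, and that a mask is given as the tuple of site indices whose nucleotides it admits; these representations and all function names are illustrative assumptions rather than anything specified in the claims.

```python
from itertools import combinations, combinations_with_replacement
from math import sqrt


def hardy_weinberg_frequencies(hap_freqs):
    """Expected frequency of every unordered haplotype pair under Hardy-Weinberg."""
    expected = {}
    for h1, h2 in combinations_with_replacement(sorted(hap_freqs), 2):
        p, q = hap_freqs[h1], hap_freqs[h2]
        expected[(h1, h2)] = p * p if h1 == h2 else 2 * p * q
    return expected


def masked_genotype(pair, admitted_sites):
    """Unphased alleles seen at the admitted sites when the rest are hidden by the mask."""
    h1, h2 = pair
    return tuple(tuple(sorted((h1[i], h2[i]))) for i in admitted_sites)


def ambiguity_score(hap_freqs, admitted_sites):
    """Sum of geometric means of the frequencies of pairs made identical by the mask."""
    expected = hardy_weinberg_frequencies(hap_freqs)
    score = 0.0
    for pair_a, pair_b in combinations(expected, 2):
        if masked_genotype(pair_a, admitted_sites) == masked_genotype(pair_b, admitted_sites):
            score += sqrt(expected[pair_a] * expected[pair_b])
    return score


# Hypothetical two-site haplotype frequencies, chosen only for illustration.
example_freqs = {"AC": 0.4, "AT": 0.3, "GC": 0.2, "GT": 0.1}
score_both_sites = ambiguity_score(example_freqs, (0, 1))
score_first_site = ambiguity_score(example_freqs, (0,))
```

In this toy data set, genotyping both sites still leaves the double heterozygote ambiguous (AC/GT versus AT/GC), and admitting only the first site raises the score further. A mask with a lower score loses less information.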

109. The computer-usable medium of claims 107 or 108, which further comprises computer-readable program code stored 
li i t i n i lit < , a, > yOi 3 c t t t i r it 
readable program code . - - i ,>r >l cadable program code for causing a computer to calculate, for two 

/pe pairs A and B that could oxoiain a y .e -1 gen^ , 
k P'\ f [> 1 f >. tl I i \. 

eornpris n g 5 0 an hi) it t in r k PA 

assigning the haplotype pair A; and (iii) if the number is greater than PA, assigning the haplotype pair B. 

110. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to 
determine polymorphic sites or sub-haplotypes that correlate with a clinical response or outcome of interest, or other 
phenotype, the computer-readable program code comprising: (a) computer-readable program code for causing a 
computer to access a database containing haplotype information, and clinical response or outcome data (clinical outcome 
values) or other phenotype data from a cohort of subjects; (b) computer-readable program code for causing a computer to 
statistically analyze each individual SNP in the haplotype for the degree to which it correlates with the clinical outcome 
values or other phenotype data and generate a numerical measure of the degree of correlation; (c) computer-readable 
pii i h t ( hiii P li i n ncal no a ure if 
the degree of correlation with the clinical outcome values or other phenotype data exceeds a first cut-off value; (d) 
computer-readable program code for causing a computer to generate all possible pair-wise combinations of the saved 
SNPs so as to provide a set of n-site sub-haplotypes where n = 2; (e) computer-readable program code for causing a 

to n ink > . i t ill i < > i he 

clinical outcome values or other n mo:ype ; 3te ; ic c ii i x ns , ; of the degree of correlation; (f) 

computer-readable program code for causing a computer to store for further processing those n-site sub-haplotypes 
whose numerical measure of the degree of correlation exceeds the first cut-off value; (g) computer-readable program code 
for causing a computer to generate all possible pair-wise combinations among and between the saved SNPs and saved 
sub-haplotypes, to produce new sub-haplotypes with increased values of n; (h) computer-readable program code for 
causing a oomoJtt to t n \ n » j 

further sub >l I | i t n tut 

1 t 1 ; i ( ode stored thereon 

> 3 » >l y t c \ ir i kq u 

correlation with the clinical outcome value or other phenotype exceeds a second cut-off value, wherein the second cut-off 
value is greater than the first cut-off value. 

112. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to 
determine polymorphic sites or sub-haplotypes that correlate with a clinical response or outcome of interest, or other 
> i >Ur i r. i! ]\ 1 1 t t it c m> 5 

computer to access a database containing haplotype information, and clinical response or outcome data (clinical outcome 
values) or other phenotype data, from a cohort of subjects; (b) computer-readable program code for causing a computer to 

statistically analyze each individual SNP in the haplotype for the degree to which it correlates with the clinical outcome 
values: >r o her ohenotj r. t i t d bit program 

code for causing a cor - etc - t ir degmeof 

ed a fi st cut-off value, (d) compute! lea-a^. ~ < r '-u te j ~ — a - - f -> ., j- 

x SNPs so as to provide a set of n-site sub-haplotypes where n = 2; (e) 
computer-readable program code for causing a computer to statistically analyze each newly generated n-site sub- 

, v r v t , > v i > , t t ' t t h i r 

p-value for the degree of correlation; (f) computer-readable program code for causing a computer to store for further 

processing those n-site sub-haplotypes whose p-value for the degree of correlation does not exceed the first cut-off value; 
(g) computer-readable program code for causing a computer to generate all possible pair-wise combinations among and 
between the saved SNPs and saved sub-haplotypes, to produce new sub-haplotypes with increased values of n; (h) 
computer-readable program code for causing a computer to repeat steps (e) through (g) until either (i) no new 
sub-haplotypes can be generated, or (ii) no further sub-haplotypes having n less than a pre-selected or user-selected limit can 
be generated. 
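The iterative build-up recited in claim 112 can be sketched as a loop that keeps every sub-haplotype whose p-value does not exceed the cut-off and then combines the saved ones pair-wise. In this sketch a sub-haplotype is a frozenset of site indices, the statistical test is a caller-supplied `p_value` function, and `max_sites` is treated as an inclusive size limit; all three are assumptions made for illustration, since the claim does not fix the test or the data representation.

```python
from itertools import combinations


def grow_sub_haplotypes(sites, p_value, cutoff, max_sites):
    """Return all sub-haplotypes (frozensets of sites) whose p-value is within the cut-off."""
    singles = [frozenset([s]) for s in sites]
    # Steps (b)-(c): analyze each individual SNP and save those that pass.
    saved = {s for s in singles if p_value(s) <= cutoff}
    tested = set(singles)
    while True:
        # Step (g): pair-wise combinations among the saved SNPs and sub-haplotypes.
        candidates = set()
        for a, b in combinations(saved, 2):
            union = a | b
            if union not in tested and len(union) <= max_sites:
                candidates.add(union)
        # Step (h): stop when nothing new can be generated.
        if not candidates:
            break
        tested |= candidates
        # Steps (e)-(f): analyze the new sub-haplotypes and save those that pass.
        saved |= {c for c in candidates if p_value(c) <= cutoff}
    return saved


def toy_p_value(sub_haplotype):
    """Hypothetical stand-in for a real association test; not part of the source."""
    return 0.04 / len(sub_haplotype) if sub_haplotype <= {0, 1, 2} else 0.9


example_result = grow_sub_haplotypes(range(4), toy_p_value, cutoff=0.05, max_sites=3)
```

Tracking the `tested` set keeps the loop from re-evaluating rejected combinations and guarantees termination, because the number of possible site subsets is finite.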

1 13. T"v r\>n -vl i r iii h i J K t code stored thereon 

o J I l J i. 1 J f i f j t- ! 1 (. ^ 

the clinical outcome value or other phenotype does not exceed a second cut-off value, wherein the second cut-off value is 
less than the first cut-off value. 

114. The computer-usable medium of claims 110-113, which further comprises computer-readable program code stored 
lierer i n run m i Jt i t < i r n 
smaller sub-haplotypes, where the smaller sub-haplotypes each have correlation values that are at least as significant as 
that of the complex sub-haplotype. 

115. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to 
determine polymorphic sites or sub-haplotypes that correlate with a clinical response or outcome of interest, or other 
phenotype of interest, the computer-readable program code comprising: (a) computer-readable program code for causing 
a computer to access a databa^ en nir o v. c or - ^ y c ^ ^ 
response, outcome, or other phenotype data from a cohort of subjects; (b) computer-readable program code for 

tai in i i n i i i i til \ 

response, outcome, or phenotype of interest, and to generate a numerical measure of the degree of correlation; (c) 
computer-readable program code for causing a computer to store for further processing 

il u i c ! M uot tot 

i in j . >r c -t c i ; ) ;< < , >ub hapiofypes having 

a single site masked, so as to provide a set of m-n site sub-haplotypes where n = 1; (e) computer-readable program code 
for causing a computer to statistically analyze each newly generated sub-haplotype for the degree to which it correlates 
with the clinical response, outcome, or phenotype of interest, and calculating a numerical measure of the degree of 
t uk h ion i i i i ( t f t t >< JO-- 

haplotypes whose numerical measure of the degree of correlation exceeds the first cut-off value; (g) computer-readable 
program code for causing a computer to generate, from the saved sub-haplotypes, all possible sub-haplotypes having 
one additional site masked; (h) computer-readable program code for causing a computer to repeat steps (e) through (g) 

u-lil etl-^r i) no new i t a it i i ik IjjIIk 

sub-haplotypes having more unmasked sites than a pre-selected limit can be generated. 

1 ! ; coi 1 1 n J 5 - 1 ii omputei eacabh prograrr < c f i J 

or causing a comj: c ?-h votypes whose numerical measure of the degree of correlation with 

the clinical response data, outcome value, or other phenotype data exceeds a second cut-off value, wherein the second 
cut-off value is greater than the first cut-off value. 

117. A computer-usable medium having computer-readable program code stored thereon, for causing a computer to 

dete'niEK i n f t t t h< i 

Ii r t program code compri n i < r >le program code for causing 

a computer to access a database containing single gene haplotype information for one or more genes, and clinical 
response, outcome data, or other phenotype data from a cohort of subjects; (b) computer-readable program code for 
causing a computer to statistically analyze each single gene haplotype for the degree to which it correlates with the clinical 
response, outcome, or phenotype of interest, and to calculate the p-value for the degree of correlation; (c) computer- 

^ c e _ j j x 1 1 jj value for the 

degree of correlation does not exceed a first cut-off value; (d) computer-readable program code for causing a computer to 

- 1 i in i pc if /pes havi i i i 

so as to provide a set of m-n site sub-haplotypes where n = 1; (e) computer-readable program code for causing a 
j tec spioiype fo e dtg ot c hich cc eaits with h 

response, outcome, or phenotype of ink r j t 1 1 n omputer- 

readable program code for causing a computer to save for further processing those sub haplofvp< < \ x for the 

3 js; (g) compute e code for causing a computer 

i rn the ; ed sub-hapbtypes, all possible sub- haplotypes a' >nal site masked; (h) computer- 

■ :ode for causing a computer to repeat steps (e) through (g) until < > hapiotypes h 

a p-value which does not exceed the first cut-off value, or (ii) no further sub-haplotypes having more unmasked sites than a 
pre-selected limit can be generated. 



118. The computer-usable medium of claim 117, which further comprises computer-readable program code stored thereon 
for causing a computer to display those saved sub-haplotypes whose p-value for the degree of correlation with the clinical 
esponse, outcome, or phenotyf: f i i < 
less than the first cut-off value. 

11- f >rnputer-usab nedium of claims 115-11 i er, co i mdabie prog - t stored 

thereon for causing a cor t 1 < , ^ on 

smaller sub-haplotypes, where the smaller sub-haplotypes each have correlation values that are at least as significant as 
that of the complex sub-haplotype. 

120. A computer programmed to cause haplotype pair assignments to be made to an individual member of a population 
whose genotype information for a gene or gene feature of interest is stored in a computer-readable form, the computer 

c »sorior 

agram code stored in 1 o ^r^^ ~i ~o„ •> ■> " ond> for 

causing a computer to generate all possible hapiotype paii£ com i >ts 

p >otatr i i if - . . . , - .... 3 j to the 

i tuf "v mbt rq t n i! v based upon the observed Hi- 1 n i h r I / or hapiotype pairs in the population; and 
computer-readable program code for causing a computer to select the most probable haplotype pair for the individual. 

121. The computer of claim 120, wherein the program code further includes computer-readable program code for causing 
a computer to correct the stored distribution of haplotypes or haplotype pairs for effects imposed by the presence of a 
limited number of individuals in the population. 

122. The computer of claim 120, wherein the program code further includes computer-readable program code for causing 
a computer to validate haplotype pair assignments by analyzing for compliance of the assigned haplotype pair with 
Mendelian inheritance principles. 

123. The computer of claim 120, wherein the population is selected from the group consisting of a reference population, a 
clinical population, a disease population, an ethnic population, a family population and a same-sex population. 

1 3 pc 3 jlation 

* ho < n j! \ r in " |r ! h i i r 

comprising a memory having at least one region for storing computer executable program code and a processor for 
executing the program code stored in memory, wherein the program code includes: computer-readable program code for 
causing a computer to generate the possible haplotype pairs consistent with the stored genotype; computer-readable 
program code f - ' , ^ - i.fii \p ,)ium\ Jit- j 1 n 

determine from the frequency data the probability, for each of the possible haplotype pairs, that the individual has the 
possible haplotype pair; and computer-readable program code for causing a computer to select the most probable 
haplotype pair for the individual. 
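The assignment procedure of claims 120 and 124 can be illustrated with a short sketch: enumerate the haplotype pairs consistent with an unphased genotype, weight each pair by its Hardy-Weinberg expectation, and pick the most probable one. The sketch assumes that candidate haplotypes are limited to those already present in a frequency table, that haplotypes are equal-length strings, and that a genotype is a list of allele pairs per site; these choices and all names are illustrative assumptions.

```python
from itertools import combinations_with_replacement


def consistent_pairs(genotype, known_haplotypes):
    """Haplotype pairs whose combined alleles match the unphased genotype at every site."""
    target = [tuple(sorted(site)) for site in genotype]
    pairs = []
    for h1, h2 in combinations_with_replacement(sorted(known_haplotypes), 2):
        if [tuple(sorted(alleles)) for alleles in zip(h1, h2)] == target:
            pairs.append((h1, h2))
    return pairs


def pair_probabilities(genotype, hap_freqs):
    """Probability of each consistent pair, from Hardy-Weinberg weights normalized to 1."""
    weights = {}
    for h1, h2 in consistent_pairs(genotype, hap_freqs):
        p, q = hap_freqs[h1], hap_freqs[h2]
        weights[(h1, h2)] = p * p if h1 == h2 else 2 * p * q
    total = sum(weights.values())
    return {pair: w / total for pair, w in weights.items()} if total else {}


def most_probable_pair(genotype, hap_freqs):
    """The consistent haplotype pair with the highest probability, or None if none fits."""
    probabilities = pair_probabilities(genotype, hap_freqs)
    return max(probabilities, key=probabilities.get) if probabilities else None


# Hypothetical population frequencies and a doubly heterozygous genotype.
example_freqs = {"AC": 0.4, "AT": 0.3, "GC": 0.2, "GT": 0.1}
example_genotype = [("A", "G"), ("C", "T")]
assigned_pair = most_probable_pair(example_genotype, example_freqs)
```

For the example genotype, two pairs are consistent (AC/GT and AT/GC), and the second is selected because its expected frequency is higher.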

125. A computer programmed to identify a correlation between a clinical response to a treatment or other phenotype and a 

p i 3piotype pair present at a candidate oci t ! J i h fi <• I ,c eofhoi 

phenotype, the computer comprising a memory having at least one region for storing computer executable program code 
and a processor for executing the program code stored in memory, wherein the program code includes: (a) computer- 
readable program code for causing a computer to access a database containing data on clinical responses to treatments, 
or other phenotypes, exhibited by individuals in a clinical population; (b) computer-readable program code for causing a 

computer to access a database containing haplotype data for each individual of the clinical population, the haplotype data 
comprising information on a plurality of polymorphic sites present at the candidate locus; and (c) computer-readable 

i < h f l< , f di s tmd Ihe 

clinical response to the treatment or other phenotype, by statistical analysis of the haplotype and clinical response data. 

126. The computer of claim 125, wherein the treatment comprises administration of a drug or drug candidate. 



127 ] >ei.oir t t -< k i c i k x i . gene oature. 



128. The computer of claim 125, wherein the program code further includes computer-readable program code for causing 
a computer to store, display, or output the degree of correlation. 



129. The computer of claim 125, wherein the program code further includes computer-readable program code for causing 
a computer to calculate the statistical significance of the correlation. 

130. A computer programmed to identify a correlation between an individual's susceptibility to a condition or disease of 

PIP t hi. i l X 1 L !. c J 

with uscepiibiiity t theconoi ->r > * f n i ru yr. : o m et he c -ipuvr compris ng a 

memory having at least one region for storing computer executable program code and a processor for executing the 
program code stored in memory, wherein the program code includes: (a) computer-readable program code for causing a 
computer to iotype data lo ^ e oi 

condition or disease of interest ("disease haplotype data"); (b) computer-readable program code for causing a computer 
to statistically analyze the disease haplotype data to calculate haplotype or haplotype pair frequencies; (c) computer- 
readable program code for causing a computer to access a database containing haplotype data for the candidate locus for 
each member of a healthy reference population ("reference haplotype data"); (d) computer-readable program code for 
causing a computer to statistically analyze the reference haplotype data to calculate haplotype or haplotype pair 
frequencies; and (e) computer-readable program code for causing a computer to identify a correlation of a haplotype or 
haplotype pair with susceptibility to the disease or condition of interest, or with the phenotype of interest, when the 
haplotype or haplotype pair has a higher frequency in the population having the phenotype, condition or disease of interest 
than in the reference population. 
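The comparison in steps (b), (d), and (e) of claim 130 can be sketched as a frequency count in each population followed by a simple comparison. The example assumes each individual is recorded as a tuple of two haplotype labels; the function names and data are hypothetical, and the sketch deliberately omits the significance test of claim 133.

```python
from collections import Counter


def haplotype_frequencies(haplotype_pairs):
    """Relative frequency of each haplotype among all chromosomes in a population."""
    counts = Counter(hap for pair in haplotype_pairs for hap in pair)
    total = sum(counts.values())
    return {hap: count / total for hap, count in counts.items()}


def enriched_haplotypes(disease_pairs, reference_pairs):
    """Haplotypes that are more frequent in the disease population than in the reference."""
    disease = haplotype_frequencies(disease_pairs)
    reference = haplotype_frequencies(reference_pairs)
    return {
        hap: (freq, reference.get(hap, 0.0))
        for hap, freq in disease.items()
        if freq > reference.get(hap, 0.0)
    }


# Hypothetical haplotype pairs for three affected and three healthy individuals.
disease_population = [("AC", "GT"), ("GT", "GT"), ("AC", "GT")]
reference_population = [("AC", "AC"), ("AC", "GT"), ("AC", "AT")]
candidates = enriched_haplotypes(disease_population, reference_population)
```

Here only GT is flagged, since it accounts for four of six disease chromosomes but one of six reference chromosomes. With samples this small the difference would need a proper statistical test before being reported as a correlation.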

131 . The computer of claim 130, wherein the candidate locus is a gene or a gene feature. 

132. The computer of claim 130, wherein the program code further includes computer-readable program code for causing 
a computer to store, display, or output the identified correlation. 

133. The computer of claim 130, wherein the program code further includes computer-readable program code for causing 

a computer to calculate the statistical significance of the correlation. 



r r t t I < i It 3f 1 pa 

and (c) computer-readable 

program o I ^ . 1 ^ >. <. t i f , ^ . x « esponses 

associated in the database with the selected haplotype or haplotype pair. 

135. The computer of claim 134, wherein the program code further includes computer-readable program code for causing 

acoropjlt "n v -> - 

13S. Acompu programmed to ;'s structure and ot eatui a display devic snif 

comniisngam i i i ar storing computer executable program code and a processor for 

^ -i ^ ^ -> „ r mo , on o ^ me i I c 

frequencies of occurrence of a gene's hapiotypes i predetermined member groupings oi a tefetence f c n - o 
computer-readable program code for causing a computer to retrieve from a database data indicative of the gene's 
structure and gene features; (c) computer-readable program code for causing a computer to display in a second area of 

mentation of the gene's struc u ^ e 3 le^t i i j I k 

features, and graphical i dcao r > r ss on the gene; (d) computer- readable program code 

for causing a computer to display in a third area of the display device, in response to a user's selection of an item 
indicating a gene eature a gtc\. - •> <. 1 it e e_ re aving user- selectable items 

indicating the position of polymorphic sites; and (e) computer-readable program code for causing a computer to retrieve 
from a database, and display in a third area of the display device, in response to a user's selection of an item indicating 



the position o a ndicative of the frequencies within the member groupings of the occurrence of 



137. A compu i ;e hapiotype pair frequency data within a population of 

individuals, for a selected gene or gene feature, the computer comprising a memory having at least one region for storing 
c« f nits executable program coc i , e; i code :;k nernop, 

c to: causing a computer to display on ihe display device a 
.i if f a polymorphic site in the gene or gene feature; (c) computer- 

readable program code for causing a computer to retrieve from a database and display on the display device, in response 
to a user's selection of one or more items indicating polymorphic sites, individual haplotype pairs in the database that differ 
3 i| if 31 moio if {-^ aolCf t r | ii - - v ^ i 'l o 

u ph h ^ - °d haplotype pairs within"one or more 



138. A computer programmed to display on a display device polymorphic site linkage data for a gene or gene structure of 
tut t " < f t. n f >r t it i t i r k f i c jmoute t exet jj abl< t. c 311 

a processor for executing the program code stored in memory, wherein the program code includes: (a) computer-readable 
program code for causing a computer to display on the display device one or more matrix structures, wherein the axes of 
each matrix structure represent the polymorphic sites in the gene or gene feature of interest, and wherein each matrix 

tl i lit >j. 3 i I t ( > I f ( I , > . lid - . . I i . J I - - 1 C I C ilt , II 

computer to display on ihe display device, in each cct: n , 
between the two polymorphic sites corresponding to the coordinates of the cell in the matrix. 



i 5 i 1 jn 3 | ! i t ! ii r i i 

medium further comprises computer-readable program code for causing a computer to display a reference color scale 
relating color to degree of linkage. 



140. A computer programmed to display on a display device a phylogenetic tree, the computer comprising a memory 

uii 3 I <. 1 3 the program code 

stored in memory, wherein the program code includes: (a) computer-readable program code for causing a computer to 
display a plurality of selectable items, each corresponding to a polymorphic site in the gene or gene feature of interest; 
and (b) computer-readable program code for causing a computer to display a phylogenetic tree structure having a node for 
each haplotype in a population, where the distance between nodes is proportional to the minimum number of nucleotides 
that would have to be changed to interconvert the corresponding haplotypes. 



14i i 14 he program code f in t , r ca 

e "odes that indicate a single lr dr> difference between t napk-.j 

represented by the nodes. 

142. The computer of claim 140, wherein the program code further includes computer-readable program code for causing 
a computer to display at each node an indication of the relative frequency of occurrence of the haplotype represented by 

the node among different population groups. 



143. A computer programmed to display a genotype ana s i puter comprising 

'ii io > - . i , < i)'fv! Jtinc the 

^ n ihr- ningran cv H i s f < 

-> ^ - - j \- Jn\^ - t - i it 

ns, each c ig k lorphic si i: f adable ,< r xk or ecus 3 a comf e 

to display on the 6\>a\ if i I f i t i nt h if 3yoe 5 inthe 

- , - •> w ' v - polymorphic sites selected ; r >k s ciabie Items; and 

(c) computer-re < a computer to display on the display device, in each cell of the matrix 

struciuie a giaf jn nent to an individual of the haplotype pair corresponding to 

j| ts genotyped only at the polymorphic sites selected from the 

second plurality of selectable items. 



144. The computer of claim 143, wherein color is used as the graphical indication of reliability of haplotype pair 



assignment, and wherein the program code further includes computer-readable program code for causing a 
computer to display a reference color scale relating color to reliability of haplotype pair assignment. 



145. A computer programmed t c s other phenotype data, of a subject population as a 

function of haplotype pairs of the individuals in the population, the computer comprising a memory having at least one 
region for storing computer executable program code and a processor for executing the program code stored in memory, 
wherein the program codr % ,a ^ r u , r i r i , v - - leve from a 

compul read raj. € ponse values, or other phenotype 

data, for the subject population ax b) compute eac i 
haplotype pair matrix, each of whose cells contains a graphical representation of the clinical response values or 
other phenotype data of individuals having the haplotype pair corresponding to the coordinates of that cell in the haplotype 
pair matrix. 



146. A computer programmed to display on a display device clinical response values, or other phenotypic data, of a subject 
population as a function of the haplotype pairs of the individuals in the population for a gene or gene feature of interest, 
he compu er o npri n j c ri t t c 1 t 1 

proceseoi 'or ^xecuf ng the prog - - - - - - ' ^ - - ~v r - ;c 

program code fc ca he 
gene or gene feature; (b) computer-readable program code for causing a computer to display one or more second 
selectable items representing clinical measurements or phenotypes; and (c) computer-readable program code for causing 
a computer to display o , , - 1 1 1 1 r 1 3 i 

selectable items, a haplotype pair matrix structure, wherein the axes of the matrix structure represent haplotypes in the 
gene or gene feature of interest that vary at the polymorphic sites corresponding to the selected item or items, and 
II I 1 1 i 3 r I < 1 )ih< 1 

phenotype data, for the clinical measurement represented by the selected second item, of individuals having the haplotype 
pair corresponding to the coordinates of the cell in the haplotype pair matrix. 



147. The computer of claim 145 or 146, wherein color is used as the graphical indication of mean clinical response value, 
or other phenotype data, and wherein the program code further includes computer-readable program code for causing a 
computer to display a reference color scale relating color to mean clinical response value. 



148. The computer of claim 147, wherein the program code further includes: (a) computer-readable program code for 

causing < ton I f ! 1 , < 3! h x , 1 r! 1 

jpresentf m code for causing a cc npi ei in 1 c 3 

istment of the range of clinical response values or other phenotype data represented by the reference color scale, 
to adjust the color of th . _i matrix 



149. The computer of claim 145 or 146, wherein the graphical representation of data is a histogram indicating the 
distribution of Individuals across the range of clinical respc . phs ata 



150. The computer of any one of claims 145, 146, or 147, wherein at least one cell in the displayed matrix includes a 
selectable area, and wherein the program code further includes computer-readable program code for causing a computer 
od splay '01 ndivicuals having the haplc > ^ ^ > < , > - < ,c < c 1 > 3 histogram 

indicating the distribution of the individuals across the range of clinical response values. 



1 J ^ ■> \ , n - 1-o ' r i t h r I 

program code for causing a computer to display a third selectable item, and computer-readable program code for causing 
a computer to display, in response to selection of the third selectable item by the user, the 

orrei 1 ion be ! 1 1 1 u 1 i t >n ii 



152. The computer of any one of claims 145,146, or 147, wherein the program code further includes computer-readable 
program code , , ji^ipi , r 1< 1 , , l L t i ; - ' - dt program code for 

play, in response to selection of the fourth selectable item by the user, the numerical mean and 
standard deviation of clinical re-go-^o _ t_ cm t u 



,146, or 147, wherein the program code further includes computer-readable 
afift octal: ^ smput bie program cod ig 

of the ' \- c i c ib c item by the user, \ results n anaiysis oi variation 
-.mia m ii i i ^ ~c - , , b c ^ - x ^ , ^ ^ 



154. A computer programmed to carry n ng an oc imal sr r /eic j i function of 

polymorphic site data for a gene or gene feature of interest to a clinical response measurement, the computer comprising 
a memory having at least one region for storing computer executable program code and a processor for executing the 

ram cod lemoi idea: (a) nputer-readable p ogram coc or causing { 

co nputer io c api3, i ib t , ' ' ^ v 

readable program code for causing a computer to display a variable controller for setting the number of agents parameter; 
(c) computer-readable program code for causing a computer to display a variable controller for setting the mutation rate 

r r ic I i i i t. i >< Mr 2 the 

nr ' r i t g - x h niogram code for causing a computer to display one or more selectable 

code for causing a computer to display a selectable item for initiation of the genetic algorithm calculation, and (g) 

comoutcr loadable pror - x or i e erci- n -x-m t , imi 

selects e items c ?r of me iter genetic 

iU mt m hIi b j j it i. il rv 

ill , - - - x . - (. t sen Shn 
generations, and {»} the results of the genetic algorithm calculation showing the optimal weights for each of the 
polymorphic sites. 
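
The claim above names three user-set parameters (number of agents, mutation rate, number of generations) and an output of optimal weights for the polymorphic sites. The following Python sketch shows one simple genetic algorithm with those parameters. It rests on assumptions that are not in the claim: the squared-error fitness, the truncation selection, the uniform crossover, the Gaussian mutation, and all function and parameter names are illustrative choices only.

```python
import random


def fitness(weights, site_data, responses):
    """Negative squared error of a weighted sum of site values (assumed fitness)."""
    error = 0.0
    for sites, response in zip(site_data, responses):
        predicted = sum(w * s for w, s in zip(weights, sites))
        error += (predicted - response) ** 2
    return -error


def run_genetic_algorithm(site_data, responses, n_agents=30,
                          mutation_rate=0.1, n_generations=50, seed=0):
    """Return (best_weights, best_fitness_per_generation)."""
    rng = random.Random(seed)
    n_sites = len(site_data[0])

    def score(weights):
        return fitness(weights, site_data, responses)

    # Each agent is one candidate weight vector, one weight per polymorphic site.
    agents = [[rng.uniform(-1.0, 1.0) for _ in range(n_sites)]
              for _ in range(n_agents)]
    history = []
    for _ in range(n_generations):
        agents.sort(key=score, reverse=True)
        history.append(score(agents[0]))
        # Truncation selection: the better half survives unchanged.
        survivors = agents[: max(2, n_agents // 2)]
        children = []
        while len(survivors) + len(children) < n_agents:
            mother, father = rng.sample(survivors, 2)
            # Uniform crossover, then Gaussian mutation at the chosen rate.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            child = [w + rng.gauss(0.0, 0.2) if rng.random() < mutation_rate else w
                     for w in child]
            children.append(child)
        agents = survivors + children
    agents.sort(key=score, reverse=True)
    history.append(score(agents[0]))
    return agents[0], history
```

Because survivors are carried over unchanged, the best fitness never decreases from one generation to the next, which makes the per-generation history easy to display.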



I f A compute xogrammec , c hnical outcome values obtained from 

selected clinical outcome measures for a selected population, the computer comprising a memory having at least one 
region for storing computer executable program code and a processor for executing the program code stored in memory, 
wherein the program code includes: (a) computer-readable program code for causing a computer to display a first 
plurality of selectable items corresponding to clinical outcome measurements; (b) computer-readable program code for 
causing a computer to display a second plurality of selectable items corresponding to clinical outcome measurements; 
(c) computer-readable program code for causing a computer to display a scatter plot of data points, each data point 
corresponding to an individual in the selected population; (d) computer-readable program code for causing the 
computer, in response to selection by the user of an item from among the first plurality of selectable items, to locate each 
data point along the x axis of the scatter plot according to the clinical outcome value for the associated individual from the 
clinical measurement represented by the selected item; and (e) computer-readable program code for causing the 
computer, in response to selection by the user of an item from among the second plurality of selectable items, to locate 
each data point along the y axis of the scatter plot according to the clinical outcome value for the associated individual 
from the clinical measurement represented by the selected item. 
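
The placement of data points by two user-selected measurements can be sketched as follows. This is a minimal illustration, assuming each individual is a dictionary with hypothetical `id` and `outcomes` keys; neither the record layout nor the function name `scatter_points` appears in the claim, and the drawing of the plot itself is omitted.

```python
def scatter_points(population, x_measure, y_measure):
    """Return one (individual_id, x, y) tuple per individual.

    x comes from the measurement selected in the first list of items and
    y from the measurement selected in the second list. Individuals that
    lack either measurement are skipped (an assumption of this sketch).
    """
    points = []
    for person in population:
        outcomes = person["outcomes"]
        if x_measure in outcomes and y_measure in outcomes:
            points.append((person["id"], outcomes[x_measure], outcomes[y_measure]))
    return points
```

Selecting a different item from either list amounts to calling the function again with a new `x_measure` or `y_measure` and redrawing.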



156. A computer programmed to provide information of use in condi c! satmeni p otocol for a 

medical condition of interest, the computer comprising a memory having at least one region for storing computer 
oa utjoh p'xjratr i J r. c >c f n 1 1 n 1 

includes: (a) computer-readable program code for causing a computer to access a database of DNA sequence data for 
selected genes or other loci in a reference population of individuals, and to access a database of (or accept as input) DNA 
sequence data for selected t v. > t k y r 0 i n 

ide fc jplotypes for each ol tlx elected 

genes or other loci; (c) computer-readable program code for causing a computer to cai h encies, population 

a . * . s J me „ lie v l ^ 3 j r t „ ^"i" 1 

population; (d) computer-readable program code for causing a computer to assign to each member of a trial population 

measures calculated in the reference population; (e) computer-readable program code for causing a computer to 
determine the correlations between individual responses to the treatment and individual haplotypes, for each of the 

it !' X < !< II > 1 3 f <. f c i Hit !( i c I 

DNA sequence data or haplotypes for one or more of the selected genes or other loci; computer-readable program 

code for causing a computer to display or output the expected response of the individual to the treatment, based on the 
determined correlations between individual responses to the treatment and individual haplotypes. 



157. The computer of claim 156, wherein the program code further includes: (a) computer-readable program code for 



causing a computer to derive from the haplotype distribution found for the reference population a reduced set of 

s r h ? n i. a 3 haploty 'pes to be accurately predicted without conducting a complete 

! r ' i t i k > r ) c <. } 1 it hi c c 

genotype markers to assign haplotypes. 

158. A computer programmed to infer genotypes of individual subjects for a selected gene having at least m polymorphic 
^ I if ' C-g 1 ! ' XI ' ii ! 7,1 - , h i 0 . . - ' o >. ' i , 

executing the program code stored in memory, wherein the program code includes: (a) computer-readable 
program code for causing a computer to access a database of m-site haplotypes of the selected gene from a 

> . N N , c . . , j i. r 1 N 1 , 1 x i c ion ,< 

, x um < r - i to construct a list of 

all genotypes that could result from all possible pairs of observed haplotypes; (d) computer-readable program code for 
causing a computer to calculate the expected frequency of the genotypes assuming Hardy-Weinberg equilibrium; (e) 

l n i it i it i is 3 c r ( i i i i II >< ,ibip rru>k£ of the .ame 

length m as the haplotypes, wherein each mask blocks the identity of the nucleotides at m-n polymorphic sites and admits 
the identity of nucleotides at the other n sites; (f) computer-readable program code for causing a computer to calculate, 
for each mask, how much ambiguity results from genotyping with only the n polymorphic sites whose identity is admitted 
by the mask; (g) computer-readable program code for causing a computer to output or display on a display device the 

calculated ambiguity for one or more masks. 
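
Two of the steps recited above lend themselves to a short Python sketch: listing the genotypes that can arise from pairs of observed haplotypes together with their expected Hardy-Weinberg frequencies, and enumerating the masks that admit n of m sites. The sketch makes assumptions not found in the claim: haplotypes are strings of equal length, a genotype is a tuple of sorted allele pairs per site, and the names `genotype_frequencies` and `masks` are hypothetical.

```python
from itertools import combinations, combinations_with_replacement


def genotype_frequencies(haplotype_freqs):
    """Map each unphased genotype to its expected Hardy-Weinberg frequency.

    haplotype_freqs: dict of haplotype string -> population frequency.
    A homozygous pair contributes p*p and a heterozygous pair 2*p*q.
    Different haplotype pairs giving the same unphased genotype are summed.
    """
    freqs = {}
    items = sorted(haplotype_freqs.items())
    for (h1, p1), (h2, p2) in combinations_with_replacement(items, 2):
        genotype = tuple(tuple(sorted(pair)) for pair in zip(h1, h2))
        expected = p1 * p1 if h1 == h2 else 2 * p1 * p2
        freqs[genotype] = freqs.get(genotype, 0.0) + expected
    return freqs


def masks(m, n):
    """Yield every mask of length m that admits n sites and blocks m - n.

    A mask is a tuple of booleans; True means the site's identity is admitted.
    """
    for admitted in combinations(range(m), n):
        yield tuple(i in admitted for i in range(m))
```

For m sites and n admitted sites this yields "m choose n" masks, so the enumeration grows quickly with m.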

1 9 f -> (.orrojtr ot claim 1 

a computer to calculate the level of ambiguity for a mask, the computer-readable program code comprising: (a) computer-readable 
program code for causing a computer to identify all pairs of genotypes that are rendered identical by application 
of the mask; (b) computer-readable program code for causing a computer to calculate the geometric mean of the 
calculated Hardy-Weinberg frequencies of each pair of genotypes rendered identical by application of the mask; (c) 
computer-readable program code for causing a computer to sum all such geometric means for all ambiguous pairs to 
obtain an ambiguity score for the mask. 
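
The three steps of this claim (find genotype pairs made identical by the mask, take the geometric mean of each pair's Hardy-Weinberg frequencies, and sum) can be sketched in Python as below. The representation is an assumption of the sketch: a genotype is a tuple with one entry per site, a mask is a tuple of booleans of the same length, and genotype frequencies are supplied as a dictionary. The function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def apply_mask(genotype, mask):
    """Keep only the sites whose identity the mask admits."""
    return tuple(site for site, admitted in zip(genotype, mask) if admitted)


def ambiguity_score(genotype_freqs, mask):
    """Sum of geometric means over genotype pairs rendered identical by the mask.

    genotype_freqs: dict of genotype -> expected Hardy-Weinberg frequency.
    """
    score = 0.0
    for (g1, f1), (g2, f2) in combinations(genotype_freqs.items(), 2):
        if apply_mask(g1, mask) == apply_mask(g2, mask):
            # Geometric mean of the two frequencies for this ambiguous pair.
            score += sqrt(f1 * f2)
    return score
```

A mask that admits every site leaves distinct genotypes distinct, so its score is zero; scores for different masks can then be compared to choose which sites to genotype.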

160. The computer of any one of claims 1 . J 5 
program code for causing a computer to assign a haplotype pair to an individual having an ambiguous genotype, the 
computer-readable program code comprising: (a) computer-readable program code for causing a computer to calculate, 
for two haplotypp- put 

Pb * 't.t'A • " ' i k. „ v. v. i <. j y ^ a hapiotype pair by a 

an or equal to PA, 

» onu ^ ^ > xi if xjc xr > c, eater than PA. assigning the hapiotype pair B. 

161. A computer p' ^ i ii i » ponse or 
outcome of interest, or other pheno yoc hr --or i 3 < nj 
computer executable program code and a processor for executing the program code stored in memory, wherein the 
program code includes: (a) computer-readable program code for causing a computer to access a database containing 

3 iU j.r ink if i 1 t s in f < r 10m 3 

coho cf cub "-cl o omo A 1 iM 1 . \ , - ii 3 

SNP in the haplotype for the degree to which it correlates with the clinical outcome values or other phenotype data, and 
generating a numerical measure of the degree of correlation; (c) computer-readable program code for causing a computer 

to store for further processing those SNPs whose numerical measure of the degree of correlation with the 

clinical outcome values or other phenotype data exceeds a first cut-off value; (d) computer-readable program code for 

causing a computer to generate all possible pair-wise combinations of the saved SNPs so as to provide a set of n-site 
sub-haplotypes where n = 2; (e) computer-readable program code for causing a computer to statistically analyze each newly 
generated n-site sub-haplotype for the degree to which it correlates with the clinical outcome values or other phenotype 

data, and calculate a numerical measure of the degree of correlation; (f) computer-readable program code for causing a 
computer to store for further processing those n-site sub-haplotypes whose numerical measure of the degree of 
correlation exceeds the first cut-off value; (g) computer-readable program code for causing a computer to generate all 

possible pair-wise combinations among and between the saved SNPs and saved sub-haplotypes, to produce new 
sub-haplotypes with increased values of n; (h) computer-readable program code for causing a computer to repeat steps 

(e) through (g) until either (i) no new sub-haplotypes can be generated, or (ii) no further sub-haplotypes having n less 
than a pre-selected or user-selected limit can be generated. 
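
The iterative build-up recited in steps (b) through (h) can be sketched in Python as follows. Several points are assumptions of the sketch rather than statements of the claim: the correlation measure is taken to be eta squared (the share of outcome variance explained by grouping subjects on the selected sites), each subject is reduced to a single haplotype tuple, and the names `eta_squared` and `search_sub_haplotypes` are hypothetical. Any other numerical measure of correlation could be substituted.

```python
from itertools import combinations
from statistics import mean


def eta_squared(sites, haplotypes, outcomes):
    """Share of outcome variance explained by grouping on the given sites."""
    groups = {}
    for hap, y in zip(haplotypes, outcomes):
        key = tuple(hap[i] for i in sorted(sites))
        groups.setdefault(key, []).append(y)
    grand = mean(outcomes)
    total = sum((y - grand) ** 2 for y in outcomes)
    between = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in groups.values())
    return between / total if total else 0.0


def search_sub_haplotypes(haplotypes, outcomes, cutoff, max_sites):
    """Return {frozenset of site indices: score} for every saved SNP or sub-haplotype."""
    n_sites = len(haplotypes[0])
    saved = {}
    tested = set()
    # Start from the single SNPs, then grow by pair-wise combination.
    candidates = {frozenset([i]) for i in range(n_sites)}
    while candidates:
        for sites in candidates:
            score = eta_squared(sites, haplotypes, outcomes)
            if score > cutoff:
                saved[sites] = score
        tested |= candidates
        # Combine saved SNPs and sub-haplotypes; skip anything already analyzed
        # and anything above the site limit.
        candidates = {
            a | b
            for a, b in combinations(saved, 2)
            if len(a | b) <= max_sites
        } - tested
    return saved
```

The loop ends when pair-wise combination yields nothing new within the site limit, which mirrors the two stopping conditions of step (h).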



162. The computer of claim 161, wherein the program code further includes computer-readable program code for causing 
a computer to display those saved SNPs and sub-haplotypes whose numerical measure of the degree of correlation with 
lu c cat ' i a -v a ' i c i )' >U f c ^ c ; < n ; f , J re ' hena c ; 3 .vie - < 

than the first cut-off value. 



SNP in the haplotype for the degree to which it correlates with the clinical outcome values or other phenotype data, and 
calculate the p-value for the degree of correlation; (c) computer-readable program code for causing a computer to store for 
further processing those SNPs whose p-value for the degree of correlation does not exceed a first cut-off value; 

(d) computer-readable program code for causing a computer to generate all possible pair-wise combinations of the saved 

SNPs so as to provide a set of n-site sub-haplotypes where n = 2; (e) computer-readable program code for causing a 

computer to statistically analyze each newly generated n-site sub-haplotype for the degree to which it correlates with the 
clinical outcome values or other phenotype data, and calculate the p-value for the degree of correlation; (f) computer- 

readable program code for causing a computer to store for further processing those n-site sub-haplotypes whose p-value 
for the degree of correlation does not exceed the first cut-off value; (g) computer-readable program code for causing a 

computer to generate all possible pair-wise combinations among and between the saved SNPs and saved sub- 
haplotypes, to produce new sub-haplotypes with increased values of n; (h) computer-readable program code for causing a 
computer to repeat steps (e) through (g) until either (i) no new sub-haplotypes can be generated, or (ii) no further sub- 
haplotypes having n less than a pre-selected or user-selected limit can be generated. 



164. The computer of claim 161, wherein the program code further includes computer-readable program code for causing 
a computer to display those saved SNPs and sub-haplotypes whose p-value for the degree of correlation with the clinical 
outcome value or other phenotype does not exceed a second cut-off value, wherein the second cut-off value is less than 
the first cut-off value. 



3x subhaplo; 



166. A computer programmed to determine polymorphic sites or sub- hi 
xilcorr Merest, ore >iype c erect, ihe jmputer compr 

storing computer executable program code and a processor for executit 
the program code includes: (a) computer-readable program code for ca 
single gent ii ^ >. .te or more genes, and clinical re 

from a cohort of subjects: bi computer-readable program code for cau? 
gene hapiotype tor the degree to which it correlates , h ff;e ci:nicai t 



numerical 



of the degre 
hose haplolyp- 



■ correlation; ( 
i who; 



• ■ mp ; i 
: ;a! rr 



and calculating a numerical me; 
computer to save for further pro- 
exceeds the first cut-off value; (; 
sub-haplotypes, all possible sue 
causing a computer to repeat si 
which exceeds the first cut-off v. 
limit can be generated. 



adable program code k 
to which it correlates wi 
he degree of correlate: 
hose sub-haplotypes w 
>. t mi-, 
pes having one additio 
through (g) until either (i 



stop 
imputer 



ialyj 



wly 



c 3 i ion >u com < phe ;type c 

i - v t i - \i e v - f a 
nericai measure of the degree of correlation 
... g a computer to generate, from the saved 
in.- (ii i 3 r f i - , ' 
sub-hapiotypes have a degree of correlation 
or (ii) no further sub-haplotypes having more unmasked sites than a pre-selected 



167. The computer of claim 166, wherein the program code further includes computer-readable program code for causing 
a computer to display those saved sub-haplotypes whose numerical measure of the degree of correlation with the clinical 
response data, outcome value, or other phenotype data exceeds a second cut-off value, wherein the second cut-off value 
> c c t han the hrs i c v h < 





or other phenofy? 




ecutable program 




eludes: (a) compi 


; hapiofy 




on or auoteots a; tcfius;;: 


type for 




e for the degree ot 



ec nore stent and ; 5type dei; 

computer-readable program code for causing a computer to statistically analyze each single 

1 i ^ is*. c 0 "o se outcome <rp c otype oi interes and to 

correlation; (c) computer-readable program code for causing a computer to store for 
further processing those haplotypes whose p-value for the degree of correlation does not exceed a first cut-off value; (d) 
computet r it i ( i . 

sites, all possible sub-haplotypes having a single site masked so as to provide a set of m-n site sub-haplotypes where n = 

1; (e) computer-readable program code for causing a computer to statistically analyze each newly generated sub- 

, i f. k . .I < « r„ c — , p ft, 31 

calculating the p-value for the degree of correlation; (f) computer-readable program code for causing a computer to save 
for further processing those sub-haplotypes whose p-value for the degree of correlation does not exceed the first cut-off 
value; (g) computer-readable program code for causing a computer to generate, from the saved sub-haplotypes, all 

: M)t\e: i, n n x . < c m < v 5 " :omputer-readabie program code for causing a computer 
to repeat steps (e) through (g) until either (i) no new sub-haplotypes have a p-value which does not exceed the first cut-off value, 

or (ii) no further sub-haplotypes having more unmasked sites than a pre-selected limit can be generated. 



169. The computer of claim 1o', wherein the program code further includes computer-readable program code for causing 
a computer to display those saved sub-haplotypes whose p-value for the degree of correlation with the clinical response, 
outcome, or phenotype of interest does not exceed a second cut-off value, wherein the second cut-off value is less than 
the first cut-off value. 



170 1 'com o ik of s -i c nt ' i it r 

our (<- J i ^ \ , 3 t I I 1rd f r n 

smaller sub-haplotypes, where the smaller sub-haplotypes each have correlation values that are at least as significant as 
that of the complex sub-haplotype. 



171. A data structure for storing and organizing biological information, stored on a computer-readable medium and 

accessible by a processor, which comprises a single parent table which is adapted for storing, organizing, and retrieving a 
plurality of genetic features by the relative positional relationships between the genetic features. 



172. The data structure of claim 171 writ. - i ■> -"It , «j , . - of three submodels comprising the data 

structure whereir JC i ;ubmod I 3 3 1 1 

repository submodel. 



1 /3. The data structure of claim i 1 1 c s>s ing of 

chromosome s 3 ons genes, gene regions, gene transcripts, transcript regions, and polymorphisms. 



174. The data structure of claim 173, further comprising a clinical repository submodel. 



1 /5. The dai; r ; c m 1 M, further cornpn g re? 'I 



176. A method for ston x mz orqa' t hi i <. 3 j >mpriing 

a single parent table which is adapted for storing, organizing, and retrieving a plurality of genetic features by the relative 
positional relationships between the genetic features; and (b) positioning a first genetic feature onto a second genetic 
feature. 



177. The method of claim 175, wherein said first genetic feature is an assembly and said second genetic feature is a gene. 



178. The method of claim 177, further comprising positioning a third genetic feature onto said gene. 

179. The method of claim 178, wherein said third genetic feature is a gene region and the method further comprises 
positioning onto said gene region a polymorphism. 

1 f t l in 1 l t i t -en the polymorphism and a; leas; one 

phenotype which is associated with the polymorphism. 

181. The method of claim 177, further comprising positioning onto said gene a haplotype which comprises a plurality of 
polymorphisms. 

182. The method of claim 178, further comprising providing a relationship between the haplotype and at least one 
phenotype which is associated with the haplotype. 

183. A data structure for storing and organizing biological information, stored on a computer-readable medium and 
accessible by a processor, which comprises at least two different fields, one of which includes a plurality of genetic 
features, and the other of which includes the relative positional relationships between the genetic features. 
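
One way to picture a single parent table that holds genetic features by their relative positions is the Python sketch below, which uses the standard `sqlite3` module. The schema is an assumption made for illustration: the table name `feature`, its columns, and the helper functions are hypothetical and are not taken from the claims. Each row records which feature it is positioned on and its offset relative to that feature.

```python
import sqlite3


def create_feature_table(conn):
    """Create one table for all feature kinds (gene, gene region, polymorphism, ...)."""
    conn.execute(
        """CREATE TABLE feature (
               id INTEGER PRIMARY KEY,
               kind TEXT NOT NULL,
               name TEXT NOT NULL,
               parent_id INTEGER REFERENCES feature(id),
               rel_start INTEGER NOT NULL DEFAULT 0
           )"""
    )


def position_feature(conn, kind, name, parent_id=None, rel_start=0):
    """Position a feature onto another feature at a relative offset; return its id."""
    cur = conn.execute(
        "INSERT INTO feature (kind, name, parent_id, rel_start) VALUES (?, ?, ?, ?)",
        (kind, name, parent_id, rel_start),
    )
    return cur.lastrowid


def absolute_start(conn, feature_id):
    """Resolve a feature's offset on the top-level feature by walking up the parents."""
    total = 0
    while feature_id is not None:
        rel_start, feature_id = conn.execute(
            "SELECT rel_start, parent_id FROM feature WHERE id = ?", (feature_id,)
        ).fetchone()
        total += rel_start
    return total
```

Because every kind of feature lives in the same table, a polymorphism positioned on a gene region, itself positioned on a gene, can be located on the top-level feature by summing the relative offsets along the chain.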